Logistic regression of family data from retrospective study designs.
Abstract
We wish to study the effects of genetic and environmental factors on disease risk, using data from families ascertained because they contain multiple cases of the disease. To do so, we must account for the way participants were ascertained, and for within-family correlations in both disease occurrences and covariates. We model the joint probability distribution of the covariates of ascertained family members, given family disease occurrence and pedigree structure. We describe two such covariate models: the random effects model and the marginal model. Both models assume a logistic form for the distribution of one person's covariates that involves a vector beta of regression parameters. The components of beta in the two models have different interpretations, and they differ in magnitude when the covariates are correlated within families. We describe ascertainment assumptions needed to estimate consistently the parameters beta(RE) in the random effects model and the parameters beta(M) in the marginal model. Under the ascertainment assumptions for the random effects model, we show that conditional logistic regression (CLR) of matched family data gives a consistent estimate of beta(RE) and a consistent estimate of the covariance matrix of that estimator. Under the ascertainment assumptions for the marginal model, we show that unconditional logistic regression (ULR) gives a consistent estimate of beta(M), and we give a consistent estimator for its covariance matrix. The random effects/CLR approach is simple to use and to interpret, but it can use data only from families containing both affected and unaffected members. The marginal/ULR approach uses data from all individuals, but its variance estimates require special computations. A C program to compute these variance estimates is available at http://www.stanford.edu/dept/HRP/epidemiology.
We illustrate these pros and cons by application to data on the effects of parity on ovarian cancer risk in mother/daughter pairs, and use simulations to study the performance of the estimates.
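As a rough illustration of why the CLR approach can use data only from families containing both affected and unaffected members, the sketch below evaluates a textbook conditional logistic log-likelihood in which each family is treated as one matched set. It is not the authors' program: the function name `conditional_loglik`, the single scalar covariate, and the `(covariates, disease_status)` data layout are assumptions made for this example.

```python
import math
from itertools import combinations


def conditional_loglik(beta, families):
    """Conditional logistic log-likelihood with one scalar covariate.

    `families` is a list of (x, y) pairs (hypothetical layout):
      x: list of covariate values, one per family member
      y: list of 0/1 disease indicators, same length as x
    Each family is treated as a matched set, conditioning on its
    number of affected members.
    """
    total = 0.0
    for x, y in families:
        n_cases = sum(y)
        # Families with no cases or only cases have conditional
        # probability 1 for every beta, so they carry no information.
        if n_cases == 0 or n_cases == len(y):
            continue
        # Numerator: linear predictor summed over the observed cases.
        observed = beta * sum(xi for xi, yi in zip(x, y) if yi == 1)
        # Denominator: sum over all ways to pick n_cases members.
        denom = sum(
            math.exp(beta * sum(x[i] for i in subset))
            for subset in combinations(range(len(x)), n_cases)
        )
        total += observed - math.log(denom)
    return total


# A discordant mother/daughter pair contributes; a concordant pair does not.
discordant = ([1.0, 0.0], [1, 0])
concordant = ([1.0, 2.0], [1, 1])
ll = conditional_loglik(0.5, [discordant, concordant])
```

In this sketch a concordant family drops out of the likelihood entirely, which mirrors the limitation stated in the abstract; the marginal/ULR approach avoids that loss at the cost of a more involved variance computation.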
Related articles
Comparison of Random Forest and Logistic Regression Methods in Predicting Mortality in Colorectal Cancer Patients and its Related Factors
Background and Objectives: The purpose of this study was to predict the mortality rate of colorectal cancer in Iranian patients and determine the effective factors on the mortality of patients with colorectal cancer using random forest and logistic regression methods. Methods: Data from 304 patients with colorectal cancer registry from the Gastroenterology and Liver Research Center of Shah...
Determining the factors related to diabetes type II with mixed logistic regression
Background and aims: Type II (non-insulin-dependent) diabetes, one of the most prevalent types of diabetes worldwide, emerges in people over the age of 55, and genetic and environmental factors contribute to the disease. The aim of this study was to determine the factors affecting type II diabetes using a generalized linear mixed model. Methods: ...
Retrospective-prospective symmetry in the likelihood and Bayesian analysis of case-control studies
Prentice & Pyke (1979) established that the maximum likelihood estimate of an odds ratio in a case-control study is the same as would be found by fitting a logistic regression; in other words, for this specific target the incorrect prospective model is inferentially equivalent to the correct retrospective model. Similar results have been obtained for other models, and conditions have also been ...
The Association between Socio-economic Factors and Coronary Artery Disease in Yazd Province: a Case-Control Study
Introduction: Socio-economic status (SES) is one of the strongest and most consistent predictors of an individual's morbidity and mortality, and coronary artery disease (CAD) is perhaps the most prominent disorder associated with socio-economic inequality. This study was conducted in Yazd province to investigate the relation between socio-economic factors and CAD. Materials and Methods: This retr...
Modeling the Risk factors of hypertension in 35-65 years old individuals using logistic regression
Introduction: Hypertension is a common cause of cardiovascular disease worldwide, so identifying its risk factors is essential for carrying out preventive measures. This study therefore aimed to use a logistic regression model to determine and assess the risk factors of hypertension in Mashhad. Materials & Methods: This cross-sectional study was carried out us...
Journal: Genetic Epidemiology
Volume 25, Issue 3
Publication date: 2003